feat(datasets): create separate ibis.FileDataset
#842
This is great!
One of the downsides of having a separate I can explore this in a future PR; this is probably good enough for now.
This is 99% there; I'd prefer some extra bits of clarity so that @deepyaman's bus factor isn't as high when it comes to maintainability.
Very excited to get this in 💪
🔥
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Thank you, it's a nice addition to the ibis datasets
* feat(datasets): create separate `ibis.FileDataset`
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* chore(datasets): deprecate `TableDataset` file I/O
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* feat(datasets): implement `FileDataset` versioning
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* chore(datasets): try `os.path.exists`, for Windows
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* revert(datasets): use pathlib, ignore Windows test
  Refs: b7ff0c7
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* docs(datasets): add `ibis.FileDataset` to contents
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* chore(datasets): add docstring for `hashable` func
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* chore(datasets): add docstring for `hashable` func
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* feat(datasets)!: expose `load` and `save` publicly
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
* chore(datasets): remove second filepath assignment
  Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tdhooghe <thomas_dhooghe@mckinsey.com>
Description
Resolves #828
Development notes
So far, I copied `ibis.TableDataset`, removing code paths for reading database tables, and adding support for file export.

Just wanted to put this out there for early feedback, but what else should be done?
* … `TableDataset`
* … `FileDataset`)
* … `FileDataset` to toctree, etc.)

Update: Versioning is not a trivial subject, because backends don't implement a consistent interface for checking whether a file exists. I plan to handle this with a PyArrow filesystem in a follow-up PR (to limit the complexity added here); this PR handles local versioning only.
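
To make the "local versioning" point concrete, here is a minimal sketch of what an existence check and latest-version lookup can look like on a local filesystem. It is not the code in this PR: the helper names `versioned_path` and `latest_version` are hypothetical, and the sketch assumes the usual Kedro layout of `<filepath>/<version>/<filename>` with timestamp-like version strings that sort lexicographically.

```python
from pathlib import Path
from typing import Optional


def versioned_path(filepath: str, version: str) -> Path:
    """Build <filepath>/<version>/<filename> (assumed Kedro-style layout)."""
    path = Path(filepath)
    return path / version / path.name


def latest_version(filepath: str) -> Optional[str]:
    """Return the greatest version whose file exists locally, or None.

    Hypothetical helper: it relies on pathlib only, so it works for local
    paths but not for remote object stores.
    """
    root = Path(filepath)
    if not root.is_dir():
        return None
    versions = [
        child.name
        for child in root.iterdir()
        # Only count a version directory if it actually contains the file.
        if (child / root.name).exists()
    ]
    # Timestamp-style versions compare correctly as plain strings.
    return max(versions, default=None)
```

Because the check goes through `pathlib`, it only answers "does this file exist?" for local paths, which is why a filesystem abstraction such as PyArrow's would be needed to extend versioning to remote storage.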
Checklist
* … `RELEASE.md` file